A Logical Unit (LU or Logical Unit type - LUT), also known as a Data Product, is a blueprint data asset, engineered to deliver a trusted dataset for a specific business purpose (a Business Entity like customer, order, or loan). It holds a set of definitions and instructions used for integrating data from source systems, processing and governing the data, storing, and delivering it. The LU is the prototype from which LU Instances (LUIs) are created.
An LU is defined and configured in the Fabric Studio as a core element of the Fabric project. Its definitions comprise three main types of objects:
LU Table: The definition of a table within the LU Schema, with its columns, primary keys, indexes, and triggers.
LU Schema: The relationship between the LU tables (similar to foreign keys). An LU schema has one LU table defined as its Root Table. The Root Table holds the LU’s unique key.
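The relationship between LU Tables, the LU Schema, and the Root Table can be pictured with a small data model. This is a minimal sketch only: the class names `LUTable` and `LUSchema`, the `link` and `reachable_from_root` methods, and the sample tables are all hypothetical and are not part of the Fabric Studio or any Fabric API.

```python
from dataclasses import dataclass, field


@dataclass
class LUTable:
    """Hypothetical stand-in for an LU Table definition."""
    name: str
    columns: list
    primary_key: list


@dataclass
class LUSchema:
    """Hypothetical stand-in for an LU Schema: tables plus parent-child links."""
    root_table: str
    tables: dict = field(default_factory=dict)
    # Each link is (parent_table, parent_column, child_table, child_column).
    links: list = field(default_factory=list)

    def add_table(self, table):
        self.tables[table.name] = table

    def link(self, parent, parent_col, child, child_col):
        # Reject links that point at columns the tables do not define.
        for tbl, col in ((parent, parent_col), (child, child_col)):
            if col not in self.tables[tbl].columns:
                raise ValueError(f"{tbl}.{col} is not defined")
        self.links.append((parent, parent_col, child, child_col))

    def reachable_from_root(self):
        """List the tables reachable from the Root Table, breadth first."""
        reached = [self.root_table]
        for current in reached:  # the list grows while we iterate over it
            for parent, _, child, _ in self.links:
                if parent == current and child not in reached:
                    reached.append(child)
        return reached


# Illustrative Customer schema; table and column names are made up.
schema = LUSchema(root_table="CUSTOMER")
schema.add_table(LUTable("CUSTOMER", ["customer_id", "name"], ["customer_id"]))
schema.add_table(LUTable("ORDERS", ["order_id", "customer_id"], ["order_id"]))
schema.add_table(LUTable("ORDER_ITEMS", ["item_id", "order_id"], ["item_id"]))
schema.link("CUSTOMER", "customer_id", "ORDERS", "customer_id")
schema.link("ORDERS", "order_id", "ORDER_ITEMS", "order_id")

print(schema.reachable_from_root())
```

In this sketch the Root Table is the entry point that carries the unique key (`customer_id`), and every other table hangs off it through the links, much as foreign keys would.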
In addition to these main objects, a Logical Unit contains other objects that are used for defining its life cycle. They can be found in the Project Tree, under each Logical Unit:
Instance Groups
Resources - files that can be saved as part of a project
IIDFinder
Parsers
Jobs
Let’s use an example of a Customer 360 implementation for Company ABC:
A Logical Unit Instance (LUI) is one instance of a Logical Unit Type: a single physical database that holds the data of one Business Entity instance, structured according to the LUT definition. Using the Customer 360 example above, assume that Company ABC has 35 million customers:
Fabric will hold 35 million instances (LUIs) of the Customer LUT. That is, one physical database for each customer.
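The "one database per instance" idea can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not a description of how Fabric stores LUIs: the use of SQLite, the table definitions in `CUSTOMER_LU_DDL`, and the helpers `create_instance` and `order_count` are all hypothetical.

```python
import sqlite3

# Hypothetical table definitions standing in for the Customer LU's structure.
CUSTOMER_LU_DDL = [
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)",
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)",
]


def create_instance(customer_id, name, orders):
    """Build one isolated database holding a single customer's data."""
    db = sqlite3.connect(":memory:")
    for ddl in CUSTOMER_LU_DDL:
        db.execute(ddl)
    db.execute("INSERT INTO customer VALUES (?, ?)", (customer_id, name))
    db.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(order_id, customer_id, total) for order_id, total in orders],
    )
    db.commit()
    return db


# Every instance shares the same structure but contains only its own entity.
instances = {
    1: create_instance(1, "Alice", [(100, 25.0), (101, 40.0)]),
    2: create_instance(2, "Bob", [(200, 10.0)]),
}


def order_count(instance_id):
    """Count orders inside one instance; no customer filter is needed."""
    cursor = instances[instance_id].execute("SELECT COUNT(*) FROM orders")
    (count,) = cursor.fetchone()
    return count


print(order_count(1), order_count(2))
```

Because each database holds exactly one customer, queries against an instance never need a `WHERE customer_id = ...` clause: the isolation is structural rather than filter-based.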
Every Fabric project starts by defining its LUs. Analyze the business requirements and understand how the consuming application will use the data. Use this information to define the different Business Entities to implement and build an LU for each Business Entity.
The data of a Business Entity is often split between different data sources. In some cases, it is preferable to create one LU that contains all data sources. In other cases, it is more advantageous to split the LUs and create a separate LU for each data source.
In general, an LU should be based on the smallest number of data sources, as long as it represents a full view of a Data Product.
For example, if you have a Data Product called Customer, but different Customer Types (e.g. consumer and business) have different data sources, the recommended approach is to create a separate LU for each Customer Type.
Below is a pros and cons table of each alternative:
Note:
Ambiguous (identical) file names are not supported within the same Logical Unit. The Fabric Studio deliberately does not enforce this restriction, so that the implementor can continue working and update the names later. For example, if 2 Java function files with identical names were exported from other projects or libraries, they can still be saved in the project in the Fabric Studio.
However, at run-time there must be no ambiguity within the LU. Otherwise, the server runs the first file it finds, with no guarantee as to which file is considered the first.
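Since the Studio does not block duplicate names, a simple pre-deployment check can catch them. The sketch below is an assumption-laden illustration: the function `find_duplicate_file_names` is hypothetical, the folder layout in the demo is made up, and names are compared exactly (case-sensitively), which may differ from how the server resolves files.

```python
from collections import defaultdict
from pathlib import Path
import tempfile


def find_duplicate_file_names(lu_root):
    """Map each file name found more than once under lu_root to its paths."""
    by_name = defaultdict(list)
    for path in sorted(Path(lu_root).rglob("*")):
        if path.is_file():
            by_name[path.name].append(path.relative_to(lu_root).as_posix())
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


# Demo on a throwaway directory with a made-up layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in (
        "Java/src/a/Logic.java",
        "Java/src/b/Logic.java",
        "Java/src/a/Utils.java",
    ):
        file = root / rel
        file.parent.mkdir(parents=True, exist_ok=True)
        file.write_text("// placeholder\n")
    duplicates = find_duplicate_file_names(root)

for name, paths in duplicates.items():
    print(f"{name} appears in: {', '.join(paths)}")
```

Running such a check against one LU's folder at a time matches the scope of the restriction, which applies within a single Logical Unit.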